Note to grader: please scroll to end of notebook for final capstone section.
Global warming threatens to irreversibly disrupt the fragile balance of natural ecosystems that are vital to sustaining life on our planet.
Scientific evidence indicates that the primary contributor to global warming and rising temperatures is CO$_{2}$. This greenhouse gas contributes to the "Greenhouse Effect" whereby the sun's warmth is trapped in the lower atmosphere leading to rising temperatures. The excessive levels of CO$_{2}$ that lead to the Greenhouse effect are a consequence of human economic and industrial activity which produce significant amounts of the gas.
The energy industry contributes a significant amount of CO$_{2}$ through the burning of fossil fuels to generate energy, though cleaner energy sources and more sustainable policy approaches exist.
With our lens focused on the energy industry, understanding the sources of CO$_{2}$ and being able to forecast the trajectory of CO$_{2}$ production can aid public policy and industrial decision-making on emissions reduction strategies, ranging from cleaner energy sources to incentives for CO$_{2}$ reductions.
Questions about the data in general:
- What is the nature of our data? (data types, timeframes, etc.)
- Do we have a comprehensive dataset?
- For example, do we have any missing values?
- If we are missing values, is the amount that is missing a sizeable portion such that it could affect our analysis?
- In particular, if we're missing data, how much of the missing data affects the set of variables that we're particularly interested in forecasting - in this case, natural gas emissions?
Questions about the trending:
- How are the different fuel type emissions changing over time?
- How do natural gas emissions trend in general, and how do they compare to other fuel types?
- Does the series appear to follow a pattern: trending up, down, or not at all? Do we notice any seasonality? The answers could be helpful in our eventual modelling/forecasting steps.
Employing data science techniques, we wish to explore whether it is feasible to develop a useful model that predicts Natural Gas Emissions for the next 12 months.
To do this we'll need to explore modeling techniques, including ARIMA, and establishing approaches that:
- Test for stationarity
- If necessary, transform or manipulate the dataset appropriately to introduce stationarity
- Techniques may include log transformation and/or differencing the series to account for any recurring patterns - which, in turn, will allow us to develop appropriate modeling techniques to produce an accurate forecast
Lastly, we'll need to test the efficacy of our model by comparing it to a hold-out data set from the time series, which we can use to develop accuracy measures such as RMSE or MAE.
Assuming we are able to develop a useful model, we can then use the results to develop a set of recommendations for policy makers with respect to the trajectory we expect for natural gas emissions over the next 12 months.
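Those accuracy measures can be computed directly with scikit-learn. A minimal sketch, using made-up hold-out and forecast numbers (purely illustrative, not drawn from the dataset):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical 12-month hold-out (illustrative values only)
actual = np.array([40.5, 39.8, 44.2, 53.6, 62.9, 58.1,
                   49.4, 43.7, 40.4, 38.3, 41.9, 48.9])
forecast = actual + np.linspace(-2.0, 2.0, 12)  # stand-in forecast with small errors

rmse = mean_squared_error(actual, forecast) ** 0.5  # root mean squared error
mae = mean_absolute_error(actual, forecast)         # mean absolute error
print(f"RMSE: {rmse:.3f}, MAE: {mae:.3f}")
```

RMSE penalizes large errors more heavily than MAE, so it is always at least as large; reporting both gives a feel for whether a few months are badly missed or all months are off a little.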
This dataset contains past monthly carbon dioxide emissions from electricity generation, from the US Energy Information Administration, categorized by fuel type such as Coal, Natural Gas, etc.
MSN:- Reference to Mnemonic Series Names (U.S. Energy Information Administration nomenclature)
YYYYMM:- The year and month in which the emissions were observed
Value:- Amount of CO2 emissions, in Million Metric Tons of Carbon Dioxide
Description:- The category of electricity production through which the CO2 is emitted
This notebook can be considered a guide to refer to while solving the problem. The evaluation will be as per the Rubric shared for each Milestone. Unlike previous courses, it does not follow the pattern of the graded questions in different sections. This notebook would give you a direction on what steps need to be taken in order to get a viable solution to the problem. Please note that this is just one way of doing this. There can be other 'creative' ways to solve the problem and we urge you to feel free and explore them as an 'optional' exercise.
In the notebook, there are markdown cells called - Observations and Insights. It is a good practice to provide observations and extract insights from the outputs.
The naming convention for different variables can vary. Please consider the code provided in this notebook as a sample code.
All the outputs in the notebook are just for reference and can be different if you follow a different approach.
There are sections called Think About It in the notebook that will help you get a better understanding of the reasoning behind a particular technique/step. Interested learners can take alternative approaches if they want to explore different techniques.
#Import basic libraries
import pandas as pd
import warnings
import itertools
import numpy as np
import statsmodels.api as sm
# to render interactive / engaging visuals
import matplotlib.pyplot as plt
import seaborn as sns
import altair as alt
#To calculate the MSE or RMSE
from sklearn.metrics import mean_squared_error
#Importing acf and pacf functions
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
#Importing models from statsmodels library
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima.model import ARIMA
# to evaluate dataset quickly
import pandas_profiling as pp
df = pd.read_excel('MER_T12_06.xlsx')
df.head()
|   | MSN | YYYYMM | Value | Description |
|---|---|---|---|---|
| 0 | CLEIEUS | 197301 | 72.076 | Coal Electric Power Sector CO2 Emissions |
| 1 | CLEIEUS | 197302 | 64.442 | Coal Electric Power Sector CO2 Emissions |
| 2 | CLEIEUS | 197303 | 64.084 | Coal Electric Power Sector CO2 Emissions |
| 3 | CLEIEUS | 197304 | 60.842 | Coal Electric Power Sector CO2 Emissions |
| 4 | CLEIEUS | 197305 | 61.798 | Coal Electric Power Sector CO2 Emissions |
#to ignore warnings (warnings and itertools were already imported above)
warnings.filterwarnings("ignore")
# Convert the "YYYYMM" column into standard datetime format and make it the index.
# errors='coerce' replaces any value that cannot be parsed as a date with NaT.
dateparse = lambda x: pd.to_datetime(x, format='%Y%m', errors='coerce')
df = pd.read_excel('MER_T12_06.xlsx', parse_dates=['YYYYMM'], index_col='YYYYMM', date_parser=dateparse)
df.sample(5)
| YYYYMM | MSN | Value | Description |
|---|---|---|---|
| 1975-10-01 | NNEIEUS | 14.988 | Natural Gas Electric Power Sector CO2 Emissions |
| 2008-04-01 | PCEIEUS | 1.193 | Petroleum Coke Electric Power Sector CO2 Emiss... |
| 1977-07-01 | RFEIEUS | 27.08 | Residual Fuel Oil Electric Power Sector CO2 Em... |
| 2003-10-01 | NNEIEUS | 22.15 | Natural Gas Electric Power Sector CO2 Emissions |
| 1986-05-01 | NWEIEUS | Not Available | Non-Biomass Waste Electric Power Sector CO2 Em... |
Next, drop any rows whose index failed to parse as a date (i.e., rows where coercion produced NaT):
ts = df[pd.Series(pd.to_datetime(df.index, errors='coerce')).notnull().values]
ts.head()
| YYYYMM | MSN | Value | Description |
|---|---|---|---|
| 1973-01-01 | CLEIEUS | 72.076 | Coal Electric Power Sector CO2 Emissions |
| 1973-02-01 | CLEIEUS | 64.442 | Coal Electric Power Sector CO2 Emissions |
| 1973-03-01 | CLEIEUS | 64.084 | Coal Electric Power Sector CO2 Emissions |
| 1973-04-01 | CLEIEUS | 60.842 | Coal Electric Power Sector CO2 Emissions |
| 1973-05-01 | CLEIEUS | 61.798 | Coal Electric Power Sector CO2 Emissions |
Jose Medina Observations
- We ran pandas profiling, which revealed the following:
- There are 384 missing values in the "Value" column, representing 8.2% of the values.
#Check the datatypes of each column. Hint: Use dtypes method
ts.dtypes
MSN            object
Value          object
Description    object
dtype: object
#convert the emission value into a numeric type
ts['Value'] = pd.to_numeric(ts['Value'], errors='coerce')
pd.DataFrame(ts).info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 4707 entries, 1973-01-01 to 2016-07-01
Data columns (total 3 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   MSN          4707 non-null   object
 1   Value        4323 non-null   float64
 2   Description  4707 non-null   object
dtypes: float64(1), object(2)
memory usage: 147.1+ KB
Jose Medina Comments:
We see from .info() that we're missing 384 entries in the Value field
- That represents 384/4707 = 8.2% of the dataset
- 384 is equivalent to removing one year's worth of data from the dataset - though the gaps may be scattered across fuel types.
- Before blindly dropping the data, we should see what proportion of values are missing with respect to Natural Gas emissions - because that is the fuel type we're specifically interested in.
# first map a more concise label to the dataset
new_col_dict = {'Coal Electric Power Sector CO2 Emissions': 'Coal',
'Distillate Fuel, Including Kerosene-Type Jet Fuel, Oil Electric Power Sector CO2 Emissions':'Distillate Fuel',
'Geothermal Energy Electric Power Sector CO2 Emissions':'Geothermal Energy',
'Natural Gas Electric Power Sector CO2 Emissions':'Natural Gas',
'Non-Biomass Waste Electric Power Sector CO2 Emissions':'Non-Biomass Waste',
'Petroleum Coke Electric Power Sector CO2 Emissions':'Petroleum Coke',
'Petroleum Electric Power Sector CO2 Emissions':'Petroleum',
'Residual Fuel Oil Electric Power Sector CO2 Emissions':'Residual Fuel Oil',
'Total Energy Electric Power Sector CO2 Emissions':'Total'}
ts['FuelType'] = ts['Description'].map(new_col_dict)
# then, printout the .info again, isolated to Natural Gas Emissions.
pd.DataFrame(ts[ts['FuelType']=='Natural Gas']).info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 523 entries, 1973-01-01 to 2016-07-01
Data columns (total 4 columns):
 #   Column       Non-Null Count  Dtype
---  ------       --------------  -----
 0   MSN          523 non-null    object
 1   Value        523 non-null    float64
 2   Description  523 non-null    object
 3   FuelType     523 non-null    object
dtypes: float64(1), object(3)
memory usage: 20.4+ KB
Good, we see that for Natural Gas in particular, we are not missing any values
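The same check can be generalized to every fuel type at once. A small sketch (on a toy frame standing in for `ts`, with hypothetical rows) that reports the share of missing `Value` entries per fuel type:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the ts DataFrame (hypothetical rows, not the real data)
ts_demo = pd.DataFrame({
    'FuelType': ['Coal', 'Coal', 'Natural Gas', 'Natural Gas', 'Geothermal Energy'],
    'Value':    [72.076, np.nan, 12.175, 11.708, np.nan],
})

# Share of missing Value entries per fuel type, worst first
missing_share = (ts_demo.groupby('FuelType')['Value']
                        .apply(lambda s: s.isna().mean())
                        .sort_values(ascending=False))
print(missing_share)
```

Running this on the real `ts` (after the `FuelType` mapping) would show at a glance which series are affected by the 384 missing values.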
Jose Medina - Custom Section
- Running Pandas Profiling to develop a broader understanding, quickly, of:
- Missing values
- Distributions
- Basic correlations
pp.ProfileReport(ts)
#Drop the missing value using dropna(inplace = True)
ts.dropna(inplace=True)
ts_grouped = ts.groupby('Description')['Value'].sum().sort_values().reset_index()
cols = ['Geothermal Energy', 'Non-Biomass Waste', 'Petroleum Coke', 'Distillate Fuel',
        'Residual Fuel Oil', 'Petroleum', 'Natural Gas', 'Coal', 'Total Emissions']
ts_grouped['FuelType'] = cols
ts_grouped
|   | Description | Value | FuelType |
|---|---|---|---|
| 0 | Geothermal Energy Electric Power Sector CO2 Em... | 10.563 | Geothermal Energy |
| 1 | Non-Biomass Waste Electric Power Sector CO2 Em... | 281.367 | Non-Biomass Waste |
| 2 | Petroleum Coke Electric Power Sector CO2 Emiss... | 338.785 | Petroleum Coke |
| 3 | Distillate Fuel, Including Kerosene-Type Jet F... | 404.887 | Distillate Fuel |
| 4 | Residual Fuel Oil Electric Power Sector CO2 Em... | 4239.312 | Residual Fuel Oil |
| 5 | Petroleum Electric Power Sector CO2 Emissions | 4982.993 | Petroleum |
| 6 | Natural Gas Electric Power Sector CO2 Emissions | 11295.359 | Natural Gas |
| 7 | Coal Electric Power Sector CO2 Emissions | 65782.393 | Coal |
| 8 | Total Energy Electric Power Sector CO2 Emissions | 82352.676 | Total Emissions |
'''
Let's exclude the "Total Electric Power Sector" field for now
so we can isolate and inspect the individual fuel types only
'''
ts_vis = ts.reset_index()
ts_vis = ts_vis[~ts_vis['FuelType'].str.contains('Total')].reset_index()
alt.Chart(ts_vis).mark_line().encode(
x='YYYYMM:T',
y='Value:Q',
color='FuelType',
strokeDash='FuelType',
tooltip=['YYYYMM', 'Value', 'Description']
).properties(
title='Fuel Type / CO2 Emissions Over Time',
width=800,
height=500
).configure_axis(
labelFontSize=20,
titleFontSize=20
).configure_legend(
titleFontSize=18,
labelFontSize=15
).interactive()
- It appears from the data that there may be some seasonality in the trending.
- Let's explore this quickly by creating a seasonal feature that might explain the saw-tooth pattern we're seeing across the different fuels. We'll isolate this to just Coal and Natural Gas for now.
'''
Because this analysis is limited to the US, we can make some simple assumptions
around seasons associated with certain months of the year:
- January, February, March, November, December = Winter
- April, May, September, October = Spring/Fall
- June, July, August = Summer
'''
ts_vis['Month'] = pd.to_datetime(ts_vis['YYYYMM']).dt.month
ts_vis['Season'] = np.where( ts_vis['Month'] <= 3, 'Winter',
np.where( ts_vis['Month'] <= 5, 'Spring/Fall',
np.where( ts_vis['Month'] <= 8, 'Summer',
np.where( ts_vis['Month'] <= 10, 'Spring/Fall',
np.where( ts_vis['Month'] <= 12, 'Winter',"Other")))))
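The nested `np.where` calls work, but a dictionary lookup is easier to audit against the stated season assumptions. A sketch of the same month-to-season mapping (same assignments as the code above), applied to a stand-in month series:

```python
import pandas as pd

# Month -> season lookup, mirroring the nested np.where assignments above
season_map = {1: 'Winter', 2: 'Winter', 3: 'Winter',
              4: 'Spring/Fall', 5: 'Spring/Fall',
              6: 'Summer', 7: 'Summer', 8: 'Summer',
              9: 'Spring/Fall', 10: 'Spring/Fall',
              11: 'Winter', 12: 'Winter'}

months = pd.Series(range(1, 13))       # stand-in for ts_vis['Month']
seasons = months.map(season_map)       # same result as the np.where chain
print(seasons.tolist())
```

On the real frame this would be `ts_vis['Season'] = ts_vis['Month'].map(season_map)`; every month is covered by the dictionary, so no catch-all "Other" label is needed.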
for i in ['Coal','Natural Gas']:
temp = ts_vis[ts_vis['FuelType']==i]
plt.title(i + " Emissions by Season")
sns.boxplot(x="Season", y="Value", data=temp)
plt.show()
for i in ['Coal','Natural Gas']:
temp = ts_vis[ts_vis['FuelType']==i]
plt.title(i + " Emissions by Month")
sns.boxplot(x="Month", y="Value", data=temp)
plt.show()
for i in set(ts_vis['FuelType']):
    temp = ts_vis[ts_vis['FuelType']==i]
ax = sns.lineplot(data=temp, x='YYYYMM', y='Value')
ax.set(title=i)
    plt.ylabel('CO2 Emissions')
plt.show()
Jose Medina Observations and Insights:
General Trending
- Coal emissions exceed those of all other fuel types, followed by natural gas.
- Though coal dominates as a source of emissions, it appears to be on a downward trend since ~2008 - it may be worthwhile to explore what, if any, policy or economic changes might have inspired a drop around that time.
- In contrast, we observe that Natural Gas has been steadily increasing over that same timeframe, with a marked increase starting on or about 2008.
- Additionally, Coal, Petroleum, Petroleum Coke, Residual Fuel Oil, and Distillate Fuel have all declined in emissions.
- Meanwhile, Geothermal and Natural Gas have both increased, while Non-Biomass Waste has held relatively flat over the past ~10 years.
Seasonal Trending
- Boxplot analysis of the seasonality of Natural Gas and Coal indicates that the top emissions periods occur in the Summer for both.
- For Natural Gas in particular, the peak seasons are Spring/Fall and Summer - in other words, hotter weather seems to drive more fuel use.
CO2_per_source = ts.groupby('Description')['Value'].sum().sort_values().reset_index()
CO2_per_source
|   | Description | Value |
|---|---|---|
| 0 | Geothermal Energy Electric Power Sector CO2 Em... | 10.563 |
| 1 | Non-Biomass Waste Electric Power Sector CO2 Em... | 281.367 |
| 2 | Petroleum Coke Electric Power Sector CO2 Emiss... | 338.785 |
| 3 | Distillate Fuel, Including Kerosene-Type Jet F... | 404.887 |
| 4 | Residual Fuel Oil Electric Power Sector CO2 Em... | 4239.312 |
| 5 | Petroleum Electric Power Sector CO2 Emissions | 4982.993 |
| 6 | Natural Gas Electric Power Sector CO2 Emissions | 11295.359 |
| 7 | Coal Electric Power Sector CO2 Emissions | 65782.393 |
| 8 | Total Energy Electric Power Sector CO2 Emissions | 82352.676 |
cols = ['Geothermal Energy', 'Non-Biomass Waste', 'Petroleum Coke', 'Distillate Fuel',
        'Residual Fuel Oil', 'Petroleum', 'Natural Gas', 'Coal', 'Total Emissions']
CO2_per_source['FuelType'] = cols
# Visualize barplot and exclude total sum - so we can see individual fuel types
sns.barplot(data=CO2_per_source[~CO2_per_source['Description'].str.contains('Total')], y='Value',x='FuelType')
plt.title('Cumulative CO2 by FuelType Since 1973 - US Electric Industry')
plt.xticks(rotation=45)
plt.show()
#Define a function to run the Augmented Dickey-Fuller test on a series
def adfuller(series):
    #Importing adfuller using statsmodels (the import shadows this function's name locally)
    from statsmodels.tsa.stattools import adfuller
    print('Dickey-Fuller Test: ')
    adftest = adfuller(series)
    adfoutput = pd.Series(adftest[0:4], index=['Test Statistic','p-value','Lags Used','No. of Observations'])
    for key,value in adftest[4].items():
        adfoutput['Critical Value (%s)'%key] = value
    print(adfoutput)
df_mte = ts.iloc[:,1:] # Monthly total emissions (mte)
df_mte= df_mte.groupby(['Description', pd.Grouper(freq="M")])['Value'].sum().unstack(level = 0)
df_mte = df_mte['Natural Gas Electric Power Sector CO2 Emissions'] # monthly total emissions (mte)
df_mte
YYYYMM
1973-01-31 12.175
1973-02-28 11.708
1973-03-31 13.994
1973-04-30 14.627
1973-05-31 17.344
...
2016-03-31 40.525
2016-04-30 39.763
2016-05-31 44.210
2016-06-30 53.567
2016-07-31 62.881
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, Length: 523, dtype: float64
Jose Medina Comments
- Let's test the stationarity of the series by exploring rolling means and standard deviations.
- Let's first split out training/test portions so we can validate the modelling later.
# Splitting the data into train and test
df_test = df_mte.loc['2015-07-31':'2016-07-31']
df_train = df_mte.loc['1973-01-31':'2015-07-30']
print(df_train.head(5))
print(df_test.head(5))
YYYYMM
1973-01-31    12.175
1973-02-28    11.708
1973-03-31    13.994
1973-04-30    14.627
1973-05-31    17.344
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
YYYYMM
2015-07-31    57.712
2015-08-31    56.662
2015-09-30    49.384
2015-10-31    43.680
2015-11-30    40.394
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
# Calculating the rolling mean and standard deviation for a window of 12 observations
rolmean=df_train.rolling(window=12).mean()
rolstd=df_train.rolling(window=12).std()
#Visualizing the rolling mean and standard deviation
plt.figure(figsize=(8,3))
actual = plt.plot(df_train, color='c', label='Actual Series')
rollingmean = plt.plot(rolmean, color='red', label='Rolling Mean')
rollingstd = plt.plot(rolstd, color='green', label='Rolling Std. Dev.')
plt.title('Rolling Mean & Standard Deviation of the Series')
plt.legend()
plt.show()
Jose Medina Observations & insights:
- We can see an upward trend in the moving average of the series.
- The standard deviation, however, is almost constant, which implies the series has roughly constant variance.
- Because of the trending mean, we can confirm that the series is not stationary.
- We can also use the Augmented Dickey-Fuller (ADF) Test to verify if the series is stationary or not. The null and alternate hypotheses for the ADF Test are defined as:
- Null hypothesis: The Time Series is non-stationary
- Alternative hypothesis: The Time Series is stationary
def adfuller(df_train):
#Importing adfuller using statsmodels
from statsmodels.tsa.stattools import adfuller
print('Dickey-Fuller Test: ')
adftest = adfuller(df_train)
adfoutput = pd.Series(adftest[0:4], index=['Test Statistic','p-value','Lags Used','No. of Observations'])
for key,value in adftest[4].items():
adfoutput['Critical Value (%s)'%key] = value
print(adfoutput)
temp = pd.DataFrame(df_train).reset_index()
adfuller(temp['Natural Gas Electric Power Sector CO2 Emissions'])
Dickey-Fuller Test: 
Test Statistic           1.252864
p-value                  0.996332
Lags Used               19.000000
No. of Observations    490.000000
Critical Value (1%)     -3.443766
Critical Value (5%)     -2.867457
Critical Value (10%)    -2.569921
dtype: float64
Jose Medina Observations:
- From the above test, the p-value ≈ 0.996 > 0.05 (at the 95% confidence level); therefore, we fail to reject the null hypothesis.
- Hence, we can confirm that the series is non-stationary.
Jose Medina Proposed approach
Potential techniques
- We now know that the series is not stationary, so we'll need to work with the data to make it stationary
- Techniques may include log transformation and/or differencing the series
- In the next milestone, we should attempt to decompose the time series into Trend, Seasonality, and Residual to attempt to make it stationary and predictable.
- Plots such as the ACF and PACF will help us understand auto-correlation and partial auto-correlation in the series.
Overall solution design
- We'll need to go through the iterative steps of model development to be certain of the solution design, but, most likely:
- We will likely build an ARIMA model that takes a range of dates as an input
- We'll need to account for seasonality (for example - peak usage of Natural Gas in Summer)
- We may need to consider the potential for an acceleration in the use of alternative/renewable resources that is NOT reflected in this dataset. This would serve as an offset to Natural Gas, as well as all other fossil-based resources. This may be difficult to estimate, but we could try to add a "bias" or "what-if" component to the forecast
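That "what-if" component could be as simple as scaling the point forecast by a user-chosen substitution rate. A minimal sketch, with a hypothetical `what_if` helper and made-up forecast numbers (nothing here comes from the dataset):

```python
import numpy as np

def what_if(point_forecast, substitution_rate):
    """Scale a forecast down by an assumed share of generation shifted to renewables."""
    return point_forecast * (1.0 - substitution_rate)

# Hypothetical 12-month point forecast (million metric tons of CO2)
forecast = np.array([40.0, 38.5, 42.0, 50.0, 60.0, 57.0,
                     48.0, 43.0, 40.0, 39.0, 41.0, 47.0])

# Scenario: 5% of gas-fired generation displaced by renewables
scenario = what_if(forecast, 0.05)
print(scenario.round(2))
```

A notebook widget or a simple slider over `substitution_rate` would give the client the kind of interactive what-if capability proposed above.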
Measures of success
- We'll want to measure success on both quantitative and qualitative aspects.
Quantitative Criteria
- We will compare our model's forecasts over the 12-month hold-out period
- We will do this by evaluating the RMSE
Qualitative Criteria
- We will want to ensure that we can explain the model in an intuitive manner and that the client can interact with it in a useful manner
- We will want to propose some interactivity/what-if scenario capability in the notebook or in a complementary application that helps the client obtain deeper and iterative insights.
- Generate energy policy recommendations based on the insights of the model
Jose Medina - Milestone 2 BEGINS HERE
For developing the time series model and forecasting, you are expected to use the natural gas CO2 emissions from electrical power generation. We need to slice this data:
### Slice the data to get the monthly total CO2 emissions of the Natural Gas Electric Power Sector
df_mte = ts.iloc[:,1:] # Monthly total emissions (mte)
df_mte= df_mte.groupby(['Description', pd.Grouper(freq="M")])['Value'].sum().unstack(level = 0)
df_mte = df_mte['Natural Gas Electric Power Sector CO2 Emissions'] # monthly total emissions (mte)
#Check 1st few rows of data
df_mte.head(5)
YYYYMM
1973-01-31    12.175
1973-02-28    11.708
1973-03-31    13.994
1973-04-30    14.627
1973-05-31    17.344
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
# Split the data into train and test
# Again - I already demonstrated this in milestone #1, but will repeat here.
# Splitting the data into train and test - using 12 months as TEST
df_test = df_mte.loc['2014-08-31':'2016-07-31']
df_train = df_mte.loc['1973-01-31':'2014-08-30']
print(df_train.head(5))
print(df_test.head(5))
YYYYMM
1973-01-31    12.175
1973-02-28    11.708
1973-03-31    13.994
1973-04-30    14.627
1973-05-31    17.344
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
YYYYMM
2014-08-31    48.871
2014-09-30    41.961
2014-10-31    38.286
2014-11-30    32.703
2014-12-31    34.800
Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
#Import the required package
import statsmodels
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint, adfuller
# Calculating the rolling mean and standard deviation for a window of 12 observations
rolmean=df_train.rolling(window=12).mean()
rolstd=df_train.rolling(window=12).std()
#Visualizing the rolling mean and standard deviation
plt.figure(figsize=(8,3))
actual = plt.plot(df_train, color='c', label='Actual Series')
rollingmean = plt.plot(rolmean, color='red', label='Rolling Mean')
rollingstd = plt.plot(rolstd, color='green', label='Rolling Std. Dev.')
plt.title('Rolling Mean & Standard Deviation of the Series')
plt.legend()
plt.show()
Jose Medina Observations & insights:
- We can see an upward trend in the moving average of the series.
- The standard deviation, however, is almost constant, which implies the series has roughly constant variance.
- Because of the trending mean, we can confirm that the series is not stationary.
- We can also use the Augmented Dickey-Fuller (ADF) Test to verify whether the series is stationary. The null and alternate hypotheses for the ADF Test are defined as:
- Null hypothesis: The Time Series is non-stationary
- Alternative hypothesis: The Time Series is stationary
Use the Augmented Dickey-Fuller (ADF) Test to verify if the series is stationary or not. The null and alternate hypotheses for the ADF Test are defined as:
- Null hypothesis: The Time Series is non-stationary
- Alternative hypothesis: The Time Series is stationary
#Define a function to use adfuller test
def adfuller(df_train):
#Importing adfuller using statsmodels
from statsmodels.tsa.stattools import adfuller
print('Dickey-Fuller Test: ')
adftest = adfuller(df_train)
adfoutput = pd.Series(adftest[0:4], index=['Test Statistic','p-value','Lags Used','No. of Observations'])
for key,value in adftest[4].items():
adfoutput['Critical Value (%s)'%key] = value
print(adfoutput)
temp = pd.DataFrame(df_train).reset_index()
adfuller(temp['Natural Gas Electric Power Sector CO2 Emissions'])
Dickey-Fuller Test: 
Test Statistic           0.435097
p-value                  0.982761
Lags Used               15.000000
No. of Observations    483.000000
Critical Value (1%)     -3.443962
Critical Value (5%)     -2.867543
Critical Value (10%)    -2.569967
dtype: float64
Jose Medina Observations:
- From the above test, the p-value ≈ 0.983 > 0.05 (at the 95% confidence level); therefore, we fail to reject the null hypothesis.
- Hence, we can confirm that the series is non-stationary.
We can use some of the following methods to convert a non-stationary series into a stationary one. One option is a moving average:
We take the average of 'k' consecutive values depending on the frequency of the time series (in this capstone, 12 months).
Here, we will take the average over the past 1 year.
# Visualize the rolling mean and standard deviation after using log transformation
plt.figure(figsize=(16,8))
df_log = np.log(df_train)
MAvg = df_log.rolling(window=12).mean()
MStd = df_log.rolling(window=12).std()
plt.plot(df_log)
plt.plot(MAvg, color='r', label = 'Moving Average')
plt.plot(MStd, color='g', label = 'Standard Deviation')
plt.legend()
plt.show()
Jose Medina Observations and Insights:
- We can still see an upward trend in the series, so we can conclude that the series is still non-stationary.
- The standard deviation is almost constant, which implies that the series has constant variance.
Let's shift the series by order 1 (or by 1 month) & apply differencing (using lagged series) and then check the rolling mean and standard deviation.
Visualize the rolling mean and rolling standard deviation of the shifted series (df_shift) and check the stationarity by calling the adfuller() function. Also, write your observations on the same.
##Code here
# Let's shift the series by order 1 (or by 1 month) & apply differencing (using lagged series) and then check the rolling mean and standard deviation.
plt.figure(figsize=(16,8))
df_shift = df_log - df_log.shift(periods = 1)
MAvg_shift = df_shift.rolling(window=12).mean()
MStd_shift = df_shift.rolling(window=12).std()
plt.plot(df_shift, color='c')
plt.plot(MAvg_shift, color='red', label = 'Moving Average')
plt.plot(MStd_shift, color='green', label = 'Standard Deviation')
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.legend()
plt.show()
#Dropping the null values that we get after applying differencing method
df_shift = df_shift.dropna()
Jose Medina - Observations and Insights:
- The mean and the standard deviation seem to be constant over time. It appears we may have made our series stationary! Let's use the adfuller test to check the stationarity, just to make sure.
adfuller(df_shift)
Dickey-Fuller Test: 
Test Statistic          -4.951428
p-value                  0.000028
Lags Used               18.000000
No. of Observations    479.000000
Critical Value (1%)     -3.444076
Critical Value (5%)     -2.867593
Critical Value (10%)    -2.569994
dtype: float64
Jose Medina Observations and Insights:
- From the above test, the p-value < 0.05 (at the 95% confidence level); therefore, we reject the null hypothesis.
- Hence, we can confirm that the series is stationary.
- Let's decompose the time series to check its different components.
#Importing the seasonal_decompose function to decompose the time series
from statsmodels.tsa.seasonal import seasonal_decompose
decomp = seasonal_decompose(df_train)
trend = decomp.trend
seasonal = decomp.seasonal
residual = decomp.resid
plt.figure(figsize=(15,10))
plt.subplot(411)
plt.plot(df_train, label='Actual', marker='.')
plt.legend(loc='upper left')
plt.subplot(412)
plt.plot(trend, label='Trend', marker='.')
plt.legend(loc='upper left')
plt.subplot(413)
plt.plot(seasonal, label='Seasonality', marker='.')
plt.legend(loc='upper left')
plt.subplot(414)
plt.plot(residual, label='Residuals', marker='.')
plt.legend(loc='upper left')
plt.tight_layout()
Jose Medina Observations and Insights
- We can see that there are significant trend, seasonality, and residual components in the series.
- The plot for seasonality shows that emissions definitely spike in the summer months (as we showed earlier with the boxplots in Milestone #1).
Plot the auto-correlation function and partial auto-correlation function to get p and q values for AR, MA, ARMA, and ARIMA models
Plot the ACF and PACF charts and find the optimal parameters
from statsmodels.graphics.tsaplots import plot_acf,plot_pacf
fig, axes = plt.subplots(2, 1, figsize=(16, 8))
plot_acf(df_shift, lags=12, ax=axes[0])   # drawing on an explicit axis avoids the empty extra figure
plot_pacf(df_shift, lags=12, ax=axes[1])
plt.show()
Jose Medina Observations and Insights:
- From the above PACF plot we can see that the highest lag at which the plot extends beyond the statistically significant boundary is lag 12.
- This indicates that an AR Model of lag 12 (p=12) should be sufficient to fit the data.
- Similarly, from the ACF plot, we can infer that q=12.
Order p is the lag value after which the PACF plot crosses the upper confidence interval for the first time. These p lags will act as our features while forecasting the AR time series.
Fit and predict the shifted series with the AR Model and calculate the RMSE. Also, visualize the time series and write your observations.
#Importing AutoReg function to apply AR model
from statsmodels.tsa.ar_model import AutoReg
#Code here
plt.figure(figsize=(16,8))
model_AR = AutoReg(df_shift, lags=12) #Using number of lags as 12
results_AR = model_AR.fit()
plt.plot(df_shift)
predict = results_AR.predict(start=0,end=len(df_shift)-1)
predict = predict.fillna(0) #Converting NaN values to 0
plt.plot(predict, color='red')
plt.title('AR Model - RMSE: %.4f'% mean_squared_error(df_shift.values, predict, squared=False)) #Calculating RMSE (y_true first)
plt.show()
Jose Medina Observations & Insights:
- We can see that by using the AR model, we get a root mean squared error (RMSE) of 0.0890.
- Let's check the AIC value of the models as well.
- The Akaike Information Criterion (AIC) is a widely used measure of a statistical model. It quantifies 1) the goodness of fit and 2) the simplicity/parsimony of the model in a single statistic. When comparing two models, the one with the lower AIC is generally "better".
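Concretely, AIC = 2k − 2 ln(L̂), where k is the number of fitted parameters and L̂ the maximized likelihood. A tiny sketch of the comparison logic, with hypothetical parameter counts and log-likelihoods (not values from the fitted models here):

```python
def aic(k, llf):
    """Akaike Information Criterion: AIC = 2*k - 2*log-likelihood."""
    return 2 * k - 2 * llf

# Hypothetical fits: (parameter count, maximized log-likelihood)
aic_a = aic(k=3, llf=120.0)   # e.g. a simpler model that fits well
aic_b = aic(k=6, llf=121.5)   # e.g. a larger model with only a small fit gain
better = 'A' if aic_a < aic_b else 'B'
print(f"AIC A = {aic_a:.1f}, AIC B = {aic_b:.1f}, lower (better): {better}")
```

The larger model improves the likelihood slightly but pays a 2-per-parameter penalty, so the simpler model wins; statsmodels exposes the same quantity directly as `results.aic`.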
results_AR.aic
-962.2248455356314
# Now, let's build MA, ARMA, and ARIMA models as well and see if we can get a better model
Order q of the MA process is obtained from the ACF plot; it is the lag after which the ACF crosses the upper confidence interval for the first time.
Fit and predict the shifted series with the MA Model and calculate the RMSE. Also, visualize the time series and write your observations.
plt.figure(figsize=(16,8))
model_MA = ARIMA(df_shift, order=(0, 0, 12)) #Using p=0, d=0, q=12
results_MA = model_MA.fit()
plt.plot(df_shift)
plt.plot(results_MA.fittedvalues, color='red')
plt.title('MA Model - RMSE: %.4f'% mean_squared_error(results_MA.fittedvalues,df_shift.values, squared=False))
plt.show()
results_MA.aic
-713.4637439626873
Jose Medina Observations & Insights:
- Note that for the MA model, lag 12 yields the lowest RMSE of 0.1146 across the lags we tried (lag=1, for example, produced a poorer RMSE).
- Recall that we earlier observed an RMSE of 0.0890 for the AR model (better than the MA model).
- We also note that the AIC for the AR model is -962.2, which is lower and better than the AIC of -713.4 for the MA model.
- The AR model gives a lower AIC than the MA model, implying that the AR model fits the training data better.
We will be using the above AR lag (p) and MA lag (q) as parameters, with d=0 in ARIMA, so that it works as an ARMA model.
Fit and predict the shifted series with the ARMA Model and calculate the RMSE. Also, visualize the time series and write your observations.
plt.figure(figsize=(16,8))
model_ARMA = ARIMA(df_shift, order=(12, 0, 12)) #Using p=12, d=0, q=12
results_ARMA = model_ARMA.fit()
plt.plot(df_shift)
plt.plot(results_ARMA.fittedvalues, color='red')
plt.title('ARMA Model - RMSE: %.4f'% mean_squared_error(results_ARMA.fittedvalues,df_shift.values, squared=False))
plt.show()
Jose Medina Observations & Insights:
- The ARMA model appears to be better, giving a lower RMSE (0.0810) than the AR and MA models.
Check the AIC value of the model
results_ARMA.aic
-1039.1693401817988
Fit and predict the shifted series with the ARIMA Model and calculate the RMSE. Also, visualize the time series and write your observations.
from statsmodels.tsa.arima.model import ARIMA
# Let's do a very simple grid search for the d parameter
# We iterate from 0 to 12 and evaluate the RMSE for each value of d
for i in range(13):
    model_ARIMA = ARIMA(df_shift, order=(12, i, 12))  # p=12, d=i, q=12
    results_ARIMA = model_ARIMA.fit()
    val = mean_squared_error(results_ARIMA.fittedvalues, df_shift.values, squared=False)
    print(f'd={i} ARIMA Model - RMSE: %.4f' % val)
d=0 ARIMA Model - RMSE: 0.0810 d=1 ARIMA Model - RMSE: 0.0804 d=2 ARIMA Model - RMSE: 0.0895 d=3 ARIMA Model - RMSE: 0.1104 d=4 ARIMA Model - RMSE: 0.1700 d=5 ARIMA Model - RMSE: 0.2187 d=6 ARIMA Model - RMSE: 0.2801 d=7 ARIMA Model - RMSE: 0.4022 d=8 ARIMA Model - RMSE: 0.4755 d=9 ARIMA Model - RMSE: 0.8634 d=10 ARIMA Model - RMSE: 1.2954 d=11 ARIMA Model - RMSE: 0.1630 d=12 ARIMA Model - RMSE: 2.5977
Jose Medina Observations
- We see that d=1 produces the lowest RMSE (0.0804).
- Since df_shift was already differenced once, choosing d=1 adds one further order of differencing, so this is a true ARIMA model rather than being equivalent to the ARMA model.
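One caveat worth sketching: `df_shift` was itself obtained by differencing `df_log` once, so d=1 here stacks a further difference on top, meaning the log series is effectively differenced twice. A toy check of that equivalence (the data is made up):

```python
import numpy as np

y = np.array([2.0, 2.3, 2.1, 2.6, 2.4, 2.9])  # stand-in for a log-transformed series

first_diff = np.diff(y)            # analogous to df_shift = df_log - df_log.shift(1)
second_diff = np.diff(first_diff)  # the extra difference that d=1 applies inside ARIMA

# differencing the once-differenced series equals differencing the original twice
assert np.allclose(second_diff, np.diff(y, n=2))
print(second_diff)
```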
# Using p=12, d=1, q=12 (d=1 produced the lowest RMSE above)
plt.figure(figsize=(16,8))
model_ARIMA = ARIMA(df_shift, order=(12,1,12))
results_ARIMA = model_ARIMA.fit()
plt.plot(df_shift, label='Original Shifted Series')
plt.plot(results_ARIMA.fittedvalues, color='red', label='ARIMA Fitted Series')
plt.title('ARIMA Model - RMSE: %.4f'% mean_squared_error(results_ARIMA.fittedvalues,df_shift.values, squared=False))
plt.legend()
plt.show()
Check the AIC value of the model
results_ARIMA.aic
-1044.392329991282
# Printing the fitted values
predictions=pd.Series(results_ARIMA.fittedvalues)
predictions
YYYYMM
1973-02-28 0.000000
1973-03-31 -0.039112
1973-04-30 0.143220
1973-05-31 0.053570
1973-06-30 0.122697
...
2014-03-31 0.032002
2014-04-30 -0.014171
2014-05-31 0.142663
2014-06-30 0.182609
2014-07-31 0.227404
Freq: M, Length: 498, dtype: float64
Jose Medina Observations
- We see that this ARIMA model has an AIC of -1044.4, which is in line with the ARMA model we produced earlier.
- I would like to be able to control the years that are plotted, which could be useful in ongoing exploratory analysis. To accommodate this, we'll do the following:
- We'll convert the Series into dataframes
- We'll add a variable that lets us set the year cut-off. For now we set it to 1972, which means all values will be included and none will be filtered out by year.
Use the correct inverse transformation depending on the model chosen to get back the original values.
Apply an inverse transformation on the predictions of the chosen model
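Before applying these steps to the fitted values, we can sanity-check on toy data that the three steps (cumulative sum, add back the first log value, exponentiate) exactly undo the log + first-difference transform. A minimal round-trip sketch (series and names are hypothetical):

```python
import numpy as np
import pandas as pd

original = pd.Series([12.2, 11.7, 14.0, 14.6, 17.3])   # toy "emissions" values
log_series = np.log(original)
shifted = (log_series - log_series.shift(1)).dropna()  # same transform as df_shift

cumsum = shifted.cumsum()                    # step 1: cumulative sum of the differences
log_recovered = cumsum + log_series.iloc[0]  # step 2: add back the first log value
recovered = np.exp(log_recovered)            # step 3: exponentiate to the original scale

# the round trip reproduces the original series (minus the first point)
assert np.allclose(recovered.values, original.values[1:])
```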
# let's get our df_log into a dataframe format so we can manipulate it more easily.
df_log_temp = pd.DataFrame(df_log).reset_index()
df_log_temp.columns = ['YYYYMM','values']
df_log_temp = df_log_temp[pd.to_datetime(df_log_temp['YYYYMM']).dt.year > 1972]  # assign the filter so it takes effect
df_log_temp.head()
|  | YYYYMM | values |
|---|---|---|
| 0 | 1973-01-31 | 2.499385 |
| 1 | 1973-02-28 | 2.460272 |
| 2 | 1973-03-31 | 2.638629 |
| 3 | 1973-04-30 | 2.682869 |
| 4 | 1973-05-31 | 2.853247 |
#Add the code blocks based on the requirements
#First step - doing cumulative sum
predictions_cumsum = predictions.cumsum() # use the .cumsum function on the predictions
print("\n\n 1st step: predictions_cumsum")
print(predictions_cumsum)
#Second step - Adding the first value of the log series to the cumulative sum values
#predictions_log = pd.Series(df_log_temp.iloc[0], index=df_log_temp.index)
predictions_log = pd.Series(df_log.values[0], index=df_log.index)
print(df_log.values[0])
predictions_log = predictions_log.add(predictions_cumsum, fill_value=0)
print("\n\n 2nd step: predictions_log")
print(predictions_log)
#Third step - applying exponential transformation
predictions_ARIMA = np.exp(predictions_log) #use exponential function
print("\n\n 3rd step: predictions_ARIMA")
predictions_ARIMA
1st step: predictions_cumsum
YYYYMM
1973-02-28 0.000000
1973-03-31 -0.039112
1973-04-30 0.104108
1973-05-31 0.157678
1973-06-30 0.280375
...
2014-03-31 0.209206
2014-04-30 0.195036
2014-05-31 0.337699
2014-06-30 0.520308
2014-07-31 0.747712
Freq: M, Length: 498, dtype: float64
2.4993846689686534
2nd step: predictions_log
YYYYMM
1973-01-31 2.499385
1973-02-28 2.499385
1973-03-31 2.460272
1973-04-30 2.603493
1973-05-31 2.657063
...
2014-03-31 2.708591
2014-04-30 2.694420
2014-05-31 2.837084
2014-06-30 3.019693
2014-07-31 3.247097
Freq: M, Length: 499, dtype: float64
3rd step: predictions_ARIMA
YYYYMM
1973-01-31 12.175000
1973-02-28 12.175000
1973-03-31 11.708000
1973-04-30 13.510844
1973-05-31 14.254364
...
2014-03-31 15.008114
2014-04-30 14.796940
2014-05-31 17.065927
2014-06-30 20.485002
2014-07-31 25.715576
Freq: M, Length: 499, dtype: float64
predictions_ARIMA_temp = pd.DataFrame(predictions_ARIMA).reset_index()
predictions_ARIMA_temp.columns = ['YYYYMM','Values']
print(predictions_ARIMA_temp['YYYYMM'].count())
predictions_ARIMA_temp['Series'] = 'Prediction'
predictions_ARIMA_temp.head()
499
|  | YYYYMM | Values | Series |
|---|---|---|---|
| 0 | 1973-01-31 | 12.175000 | Prediction |
| 1 | 1973-02-28 | 12.175000 | Prediction |
| 2 | 1973-03-31 | 11.708000 | Prediction |
| 3 | 1973-04-30 | 13.510844 | Prediction |
| 4 | 1973-05-31 | 14.254364 | Prediction |
df_train_temp = pd.DataFrame(df_train).reset_index()
df_train_temp.columns = ['YYYYMM','Values']
print(df_train_temp['YYYYMM'].count())
df_train_temp['Series'] = 'Original Train'
df_train_temp.head()
499
|  | YYYYMM | Values | Series |
|---|---|---|---|
| 0 | 1973-01-31 | 12.175 | Original Train |
| 1 | 1973-02-28 | 11.708 | Original Train |
| 2 | 1973-03-31 | 13.994 | Original Train |
| 3 | 1973-04-30 | 14.627 | Original Train |
| 4 | 1973-05-31 | 17.344 | Original Train |
# Here we filter both dataframes to years > 1972 (with this cut-off, no rows are removed)
predictions_ARIMA_temp = predictions_ARIMA_temp[pd.to_datetime(predictions_ARIMA_temp['YYYYMM']).dt.year > 1972]
df_train_temp = df_train_temp[pd.to_datetime(df_train_temp['YYYYMM']).dt.year > 1972]
Plot the original vs predicted series
df_full = pd.concat([predictions_ARIMA_temp, df_train_temp])  # pd.concat replaces the deprecated DataFrame.append
sns.lineplot(data=df_full, x="YYYYMM", y="Values", hue='Series')
<AxesSubplot:xlabel='YYYYMM', ylabel='Values'>
Jose Medina Observations & Insights:
- We can see that the predicted series is similar to the original series up to around 1990, i.e. the model is good at predicting values over that span.
- However, from roughly 1990 onward we start to see a separation, and the model does not perform well.
- This divergence coincides with the timing of the shale gas boom that occurred roughly between 2000 and 2010.
Note: though we will proceed in milestone #2 to develop predictions based on this current implementation, it is clear from the visual that forecasts will likely be poor. We will need to alter our design to ensure it is informed by the latest trend, which appears to begin around the year 2000.
I will add further commentary on this in the last section of this notebook
#Forecasting the values for next 24 months
forecasted_ARIMA = results_ARIMA.forecast(steps=24) # here steps represent the number of months
forecasted_ARIMA
2014-08-31 0.050231 2014-09-30 -0.187578 2014-10-31 -0.182866 2014-11-30 -0.064848 2014-12-31 0.074093 2015-01-31 0.003712 2015-02-28 -0.094845 2015-03-31 -0.020041 2015-04-30 -0.038771 2015-05-31 0.154630 2015-06-30 0.145917 2015-07-31 0.190509 2015-08-31 0.040137 2015-09-30 -0.192760 2015-10-31 -0.165089 2015-11-30 -0.069809 2015-12-31 0.070854 2016-01-31 0.011568 2016-02-29 -0.095183 2016-03-31 -0.032100 2016-04-30 -0.027600 2016-05-31 0.144449 2016-06-30 0.151935 2016-07-31 0.187314 Freq: M, Name: predicted_mean, dtype: float64
#Creating a series of cumulative sum
forecasted_cumsum = forecasted_ARIMA.cumsum()
forecasted_cumsum
#Making a new dataframe to get the indices from 2014-08 through 2016-07
index = pd.date_range('2014-08-31','2016-07-31' , freq='1M')
df1 = pd.DataFrame()
df1['cumsum'] = forecasted_cumsum
df1.index = index
#Adding last value of the log of the training data
df1['Forecasted'] = df1['cumsum'] + float(df_log_temp.iloc[-1].values[1])
df1
|  | cumsum | Forecasted |
|---|---|---|
| 2014-08-31 | 0.050231 | 3.876718 |
| 2014-09-30 | -0.137347 | 3.689140 |
| 2014-10-31 | -0.320213 | 3.506274 |
| 2014-11-30 | -0.385061 | 3.441426 |
| 2014-12-31 | -0.310968 | 3.515519 |
| 2015-01-31 | -0.307256 | 3.519231 |
| 2015-02-28 | -0.402101 | 3.424386 |
| 2015-03-31 | -0.422142 | 3.404345 |
| 2015-04-30 | -0.460913 | 3.365574 |
| 2015-05-31 | -0.306283 | 3.520204 |
| 2015-06-30 | -0.160367 | 3.666120 |
| 2015-07-31 | 0.030142 | 3.856629 |
| 2015-08-31 | 0.070279 | 3.896766 |
| 2015-09-30 | -0.122481 | 3.704006 |
| 2015-10-31 | -0.287570 | 3.538917 |
| 2015-11-30 | -0.357378 | 3.469109 |
| 2015-12-31 | -0.286524 | 3.539963 |
| 2016-01-31 | -0.274956 | 3.551531 |
| 2016-02-29 | -0.370139 | 3.456348 |
| 2016-03-31 | -0.402239 | 3.424248 |
| 2016-04-30 | -0.429839 | 3.396648 |
| 2016-05-31 | -0.285390 | 3.541097 |
| 2016-06-30 | -0.133455 | 3.693032 |
| 2016-07-31 | 0.053859 | 3.880346 |
#Applying exponential transformation to the forecasted log values
forecasted_ARIMA = np.exp(df1['Forecasted'])
forecasted_ARIMA
2014-08-31 48.265541 2014-09-30 40.010404 2014-10-31 33.323857 2014-11-30 31.231459 2014-12-31 33.633380 2015-01-31 33.758471 2015-02-28 30.703791 2015-03-31 30.094569 2015-04-30 28.950108 2015-05-31 33.791305 2015-06-30 39.099917 2015-07-31 47.305627 2015-08-31 49.242945 2015-09-30 40.609675 2015-10-31 34.429617 2015-11-30 32.108110 2015-12-31 34.465646 2016-01-31 34.866667 2016-02-29 31.701004 2016-03-31 30.699549 2016-04-30 29.863830 2016-05-31 34.504734 2016-06-30 40.166432 2016-07-31 48.440966 Freq: M, Name: Forecasted, dtype: float64
#Plotting the original vs predicted series and Forecast
# Filtering to year 2000 and beyond to improve readability.
plt.figure(figsize=(16,8))
plt.plot(forecasted_ARIMA.loc['2000-01-01':], label = '24 Month Prediction', color='b')
plt.plot(df_train.loc['2000-01-01':], label = 'Train', color='y')
plt.plot(df_test.loc['2000-01-01':], label = 'Test', color='r')
plt.title('Actual vs Predicted vs. Train', fontsize=20)
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.legend(prop={'size': 20})
plt.show()
print('Test Data RMSE: %.4f'% mean_squared_error(forecasted_ARIMA.values,df_test.values, squared=False))
print('Train Data RMSE: %.4f'% mean_squared_error(predictions_ARIMA_temp.Values,df_train.values, squared=False))
Test Data RMSE: 8.1462 Train Data RMSE: 11.0983
Jose Medina - Conclusions from the above analysis
Contrary to what we might expect, the RMSE is higher on the training data than on the testing data. After roughly the year 2000, we see a shift in the behavior of natural gas emissions. This may be due to the known boom in shale gas production. Our model has not learned this pattern from previous years, and hence the predicted values are not as close to the actual values as we would like.
Jose Medina Proposed Approach
Refined insights:
In the original data set:
- We can see an upward trend in the series in the moving average.
- However, the standard deviation is almost constant, which implies that the series has constant variance.
- We can confirm that the series is not stationary.
- This was confirmed with the Augmented Dickey-Fuller (ADF) test.
- We applied a log transformation and differencing to make the data set stationary, and we confirmed stationarity with the ADF test.
- Seasonal decomposition and the ACF/PACF plots clearly showed that a 12-month lag has the highest correlation and should be used for the p and q parameters.
- We performed a very simple grid search over the remaining ARIMA parameter (d, the integration term) and determined the optimal d=1.
Comparison of various techniques and their relative performance:
- Having evaluated AR, MA, ARMA, and ARIMA models, we know that ARIMA performs best, with the lowest RMSE and AIC. However, it performed only moderately better than the ARMA and AR models. The worst of all techniques was MA.
- MA Model: 0.1146 RMSE, -713.4 AIC
- AR Model: 0.0889 RMSE, -962.2 AIC
- ARMA Model: 0.0810 RMSE, -1039.17 AIC
- ARIMA Model: 0.0804 RMSE, -1044.39 AIC
I look forward to implementing this in the next milestone!
Jose Medina - Final Milestone BEGINS HERE
We will pick up where we left off last time following up on the recommendations above.
- In this final submission we need to accomplish the following:
- Step 1: Reduce the time series to the data from year 2000 onward
- Step 2: Split into train/test
- Step 3: Explore modelling using SARIMA techniques to accommodate the seasonality inherent in the series, ensuring we use a robust grid search for hyperparameter tuning
- Step 4: Evaluate the performance of the model, comparing it to previous methods
- Step 5: Create some what-if scenarios to facilitate decision-making
- Step 6: Present conclusions, recommendations, and next steps
'''
> - Step 1: Reduce the time series to the data from year 2000 onward
> - Step 2: Split into train/test
'''
df_2000 = df_mte.loc['2000-01-31':'2016-07-31']
# Splitting the data into train and test - using the last 24 months as TEST
df_test = df_2000.loc['2014-08-31':'2016-07-31']
df_train = df_2000.loc['2000-01-31':'2014-08-30']
print(df_train.head(5))
print(df_test.head(5))
YYYYMM 2000-01-31 17.571 2000-02-29 15.405 2000-03-31 19.158 2000-04-30 19.809 2000-05-31 28.502 Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64 YYYYMM 2014-08-31 48.871 2014-09-30 41.961 2014-10-31 38.286 2014-11-30 32.703 2014-12-31 34.800 Freq: M, Name: Natural Gas Electric Power Sector CO2 Emissions, dtype: float64
# Visualize the rolling mean and standard deviation after using log transformation
plt.figure(figsize=(16,8))
df_log = np.log(df_train)
MAvg = df_log.rolling(window=12).mean()
MStd = df_log.rolling(window=12).std()
plt.plot(df_log)
plt.plot(MAvg, color='r', label = 'Moving Average')
plt.plot(MStd, color='g', label = 'Standard Deviation')
plt.legend()
plt.show()
Jose Medina Observations and Insights:
- We can still see an upward trend in the series, so we can conclude that the series is still non-stationary.
- The standard deviation is almost constant, which implies that the series has constant variance.
# Apply first-order differencing and visualize the rolling mean and standard deviation
plt.figure(figsize=(16,8))
df_shift = df_log - df_log.shift(periods = 1)
MAvg_shift = df_shift.rolling(window=12).mean()
MStd_shift = df_shift.rolling(window=12).std()
plt.plot(df_shift, color='c')
plt.plot(MAvg_shift, color='red', label = 'Moving Average')
plt.plot(MStd_shift, color='green', label = 'Standard Deviation')
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.legend()
plt.show()
#Dropping the null values that we get after applying differencing method
df_shift = df_shift.dropna()
Jose Medina - Observations and Insights:
- The mean and the standard deviation seem to be constant over time. It appears we may have made our series stationary!
- Let us use the adfuller test to check stationarity, just to make sure.
adfuller(df_shift)
Dickey-Fuller Test: Test Statistic -4.254157 p-value 0.000533 Lags Used 14.000000 No. of Observations 159.000000 Critical Value (1%) -3.472161 Critical Value (5%) -2.879895 Critical Value (10%) -2.576557 dtype: float64
Jose Medina Observations and Insights:
- From the above test, we can see that the p-value < 0.05 (for a 95% confidence level); therefore, we reject the null hypothesis.
- Hence, we can confirm that the series is stationary.
- Let's decompose the time series to check its different components.
### **Elimination of trend and seasonality: Decomposition**
decomp = seasonal_decompose(df_train)
trend = decomp.trend
seasonal = decomp.seasonal
residual = decomp.resid
plt.figure(figsize=(15,10))
plt.subplot(411)
plt.plot(df_train, label='Actual', marker='.')
plt.legend(loc='upper left')
plt.subplot(412)
plt.plot(trend, label='Trend', marker='.')
plt.legend(loc='upper left')
plt.subplot(413)
plt.plot(seasonal, label='Seasonality', marker='.')
plt.legend(loc='upper left')
plt.subplot(414)
plt.plot(residual, label='Residuals', marker='.')
plt.legend(loc='upper left')
plt.tight_layout()
Jose Medina Observations and Insights
- We can see that there are significant trend, seasonality, and residual components in the series.
- The seasonality plot shows that emissions definitely spike in the summer months (as we showed earlier with the boxplots in Milestone #1).
Jose Medina - SARIMA Implementation
We will use the implementation of SARIMA provided by the statsmodels library.
This model has hyperparameters that control the modelling of the trend and seasonality of the series, specifically:
- order: A tuple of p, d, and q parameters for modelling the trend.
- seasonal_order: A tuple of P, D, Q, and m parameters for modelling the seasonality.
- trend: A parameter controlling the deterministic trend, one of ‘n’, ‘c’, ‘t’, ‘ct’ for no trend, constant, linear, and constant with linear trend, respectively.
Also recall that the non-seasonal (pure ARIMA) portion of this model must be defined as well
- p: the order of the autoregressive process
- d: the degree of differencing (number of times it was differenced)
- q: the order of the moving average process
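To make the seasonal parameters concrete: with m=12, a seasonal difference of order D=1 subtracts the value observed in the same month one year earlier, which removes a stable annual cycle; the seasonal P and Q terms then model lags that are multiples of 12. A toy sketch of the D=1, m=12 transform (the data is made up):

```python
import numpy as np

m = 12                   # seasonal period for monthly data
t = np.arange(48)
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / m)  # toy series: trend + exact annual cycle

seasonal_diff = y[m:] - y[:-m]  # the D=1, m=12 seasonal difference

# the annual cycle cancels exactly, leaving only the 12-month trend increment 6.0
assert np.allclose(seasonal_diff, 6.0)
print(seasonal_diff[:3])
```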
'''
Courtesy of: https://machinelearningmastery.com/how-to-grid-search-sarima-model-hyperparameters-for-time-series-forecasting-in-python/
We can start-off by defining a function that will fit a model with a given configuration and make a one-step forecast.
The sarima_forecast() below implements this behavior.
The function takes an array or list of contiguous prior observations and a list of configuration parameters used to configure the model, specifically two tuples and a string for the trend order, seasonal order, and trend parameter.
We also try to make the model robust by relaxing constraints, such as that the data must be stationary and that the MA transform be invertible.
'''
from math import sqrt
from multiprocessing import cpu_count
from joblib import Parallel
from joblib import delayed
from warnings import catch_warnings
from warnings import filterwarnings
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error
def sarima_forecast(history, config):
    order, sorder, trend = config
    # define model
    model = SARIMAX(history, order=order, seasonal_order=sorder, trend=trend, enforce_stationarity=False, enforce_invertibility=False)
    # fit model
    model_fit = model.fit(disp=False)
    aic = model_fit.aic
    # make one step forecast
    yhat = model_fit.predict(len(history), len(history))
    return yhat[0], aic
# split a univariate dataset into train/test sets
def train_test_split(data, n_test):
    return data[:-n_test], data[-n_test:]

# root mean squared error or rmse
def measure_rmse(actual, predicted):
    return sqrt(mean_squared_error(actual, predicted))
def walk_forward_validation(data, n_test, cfg):
    predictions = list()
    # split dataset
    train, test = train_test_split(data, n_test)
    # seed history with training dataset
    history = [x for x in train]
    # step over each time-step in the test set
    for i in range(len(test)):
        # fit model and make forecast for history
        yhat, aic = sarima_forecast(history, cfg)
        # store forecast in list of predictions
        predictions.append(yhat)
        # add actual observation to history for the next loop
        history.append(test[i])
    # estimate prediction error
    error = measure_rmse(test, predictions)
    return error, aic
'''We can trap exceptions and ignore warnings during the grid search by wrapping all
calls to walk_forward_validation() with a try-except and a catch_warnings block
to ignore warnings. We can also add debugging support to disable these protections
in case we want to see what is really going on.
'''
# score a model, return None on failure
def score_model(data, n_test, cfg, debug=False):
    result = None
    # convert config to a key
    key = str(cfg)
    # show all warnings and fail on exception if debugging
    if debug:
        result, aic = walk_forward_validation(data, n_test, cfg)
    else:
        # one failure during model validation suggests an unstable config
        try:
            # never show warnings when grid searching, too noisy
            with catch_warnings():
                filterwarnings("ignore")
                result, aic = walk_forward_validation(data, n_test, cfg)
        except Exception:
            result = None
    # check for an interesting result
    if result is not None:
        print(' > Model[%s] - RMSE %.3f | AIC %.3f' % (key, result, aic))
    return (key, result)
# grid search configs
'''
- We can define a Parallel object with the number of cores to use, set to the number of cores detected in your hardware.
- We can then create a list of tasks to execute in parallel, one call to the score_model() function for each model configuration we have.
- Finally, we can use the Parallel object to execute the list of tasks in parallel.
'''
def grid_search(data, cfg_list, n_test, parallel=False):
    scores = None
    if parallel:
        # execute configs in parallel
        executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')
        tasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)
        scores = executor(tasks)
    else:
        scores = [score_model(data, n_test, cfg) for cfg in cfg_list]
    # remove empty results
    scores = [r for r in scores if r[1] is not None]
    # sort configs by error, ascending
    scores.sort(key=lambda tup: tup[1])
    return scores
# create a set of sarima configs to try
# limiting p_params to [0, 1] because we know from the prior AR/MA analysis which lags are significant
# eliminated 'n' from t_params because prior grid-search iterations showed the best trend term is not 'n'
def sarima_configs(seasonal=[0]):
    models = list()
    # define config lists
    p_params = [0, 1]
    d_params = [0, 1]
    q_params = [0, 1, 2]
    t_params = ['t']  # ['n', 'c', 't', 'ct']
    P_params = [0, 1, 2]
    D_params = [0, 1]
    Q_params = [0, 1, 2]
    m_params = seasonal
    # create config instances
    for p in p_params:
        for d in d_params:
            for q in q_params:
                for t in t_params:
                    for P in P_params:
                        for D in D_params:
                            for Q in Q_params:
                                for m in m_params:
                                    cfg = [(p, d, q), (P, D, Q, m), t]
                                    models.append(cfg)
    return models
- Next, the only thing left to do is to define a list of model configurations to try for a dataset.
'''
> - Step 3: Explore modelling using SARIMA techniques to accommodate the seasonality inherent in the series, ensuring we use a robust grid search for hyperparameter tuning
'''
if __name__ == '__main__':
    # load dataset
    series = df_2000  # the series we filtered earlier, containing time series data for year 2000 onward
    data = series.values
    # trim dataset to the last 5 years
    data = data[-(5 * 12):]
    # data split
    # We'll split off 24 months so that we can truly compare apples to apples with the ARIMA model from Milestone #2
    n_test = 24
    # model configs
    cfg_list = sarima_configs(seasonal=[0, 12])
    # grid search
    scores = grid_search(data, cfg_list, n_test)
    print('done')
    # list top 3 configs
    for cfg, error in scores[:3]:
        print(cfg, error)
> Model[[(0, 0, 0), (0, 0, 0, 0), 'c']] - RMSE 9.039 | AIC 401.262 > Model[[(0, 0, 0), (0, 0, 0, 12), 'c']] - RMSE 9.039 | AIC 401.262 > Model[[(0, 0, 0), (0, 0, 1, 12), 'c']] - RMSE 7.220 | AIC 298.451 > Model[[(0, 0, 0), (0, 0, 2, 12), 'c']] - RMSE 6.625 | AIC 222.821 > Model[[(0, 0, 0), (0, 1, 0, 12), 'c']] - RMSE 5.628 | AIC 282.866 > Model[[(0, 0, 0), (0, 1, 1, 12), 'c']] - RMSE 4.713 | AIC 1350.117 > Model[[(0, 0, 0), (0, 1, 2, 12), 'c']] - RMSE 3.243 | AIC 112.104 > Model[[(0, 0, 0), (1, 0, 0, 12), 'c']] - RMSE 5.894 | AIC 286.111 > Model[[(0, 0, 0), (1, 0, 1, 12), 'c']] - RMSE 15.865 | AIC 281.851 > Model[[(0, 0, 0), (1, 0, 2, 12), 'c']] - RMSE 4.136 | AIC 183.197 > Model[[(0, 0, 0), (1, 1, 0, 12), 'c']] - RMSE 4.628 | AIC 202.079 > Model[[(0, 0, 0), (1, 1, 1, 12), 'c']] - RMSE 65.191 | AIC 1412.815 > Model[[(0, 0, 0), (1, 1, 2, 12), 'c']] - RMSE 3.436 | AIC 107.267 > Model[[(0, 0, 0), (2, 0, 0, 12), 'c']] - RMSE 4.922 | AIC 203.260 > Model[[(0, 0, 0), (2, 0, 1, 12), 'c']] - RMSE 8.516 | AIC 208.763 > Model[[(0, 0, 0), (2, 0, 2, 12), 'c']] - RMSE 4.282 | AIC 207.689 > Model[[(0, 0, 0), (2, 1, 0, 12), 'c']] - RMSE 3.661 | AIC 113.895 > Model[[(0, 0, 0), (2, 1, 1, 12), 'c']] - RMSE 4.615 | AIC 964.578 > Model[[(0, 0, 0), (2, 1, 2, 12), 'c']] - RMSE 6.922 | AIC 104.818 > Model[[(0, 0, 0), (0, 0, 0, 0), 't']] - RMSE 17.227 | AIC 509.573 > Model[[(0, 0, 0), (0, 0, 0, 12), 't']] - RMSE 17.227 | AIC 509.573 > Model[[(0, 0, 0), (0, 0, 1, 12), 't']] - RMSE 13.798 | AIC 354.625 > Model[[(0, 0, 0), (0, 0, 2, 12), 't']] - RMSE 7.359 | AIC 232.556 > Model[[(0, 0, 0), (0, 1, 0, 12), 't']] - RMSE 4.987 | AIC 275.616 > Model[[(0, 0, 0), (0, 1, 1, 12), 't']] - RMSE 3.901 | AIC 1545.762 > Model[[(0, 0, 0), (0, 1, 2, 12), 't']] - RMSE 3.057 | AIC 105.007 > Model[[(0, 0, 0), (1, 0, 0, 12), 't']] - RMSE 3.776 | AIC 268.368 > Model[[(0, 0, 0), (1, 0, 1, 12), 't']] - RMSE 25.233 | AIC 259.377 > Model[[(0, 0, 0), (1, 0, 2, 12), 't']] - RMSE 3.268 | AIC 168.408 > Model[[(0, 0, 0), 
(1, 1, 0, 12), 't']] - RMSE 4.119 | AIC 194.135 > Model[[(0, 0, 0), (1, 1, 1, 12), 't']] - RMSE 88.019 | AIC 1608.461 > Model[[(0, 0, 0), (1, 1, 2, 12), 't']] - RMSE 3.459 | AIC 98.066 > Model[[(0, 0, 0), (2, 0, 0, 12), 't']] - RMSE 3.684 | AIC 183.020 > Model[[(0, 0, 0), (2, 0, 1, 12), 't']] - RMSE 14.700 | AIC 174.164 > Model[[(0, 0, 0), (2, 0, 2, 12), 't']] - RMSE 3.320 | AIC 168.958 > Model[[(0, 0, 0), (2, 1, 0, 12), 't']] - RMSE 3.434 | AIC 106.501 > Model[[(0, 0, 0), (2, 1, 1, 12), 't']] - RMSE 5.449 | AIC 952.890 > Model[[(0, 0, 0), (2, 1, 2, 12), 't']] - RMSE 5.976 | AIC 98.857 > Model[[(0, 0, 0), (0, 0, 0, 0), 'ct']] - RMSE 8.026 | AIC 397.254 > Model[[(0, 0, 0), (0, 0, 0, 12), 'ct']] - RMSE 8.026 | AIC 397.254 > Model[[(0, 0, 0), (0, 0, 1, 12), 'ct']] - RMSE 6.726 | AIC 290.028 > Model[[(0, 0, 0), (0, 0, 2, 12), 'ct']] - RMSE 4.887 | AIC 206.514 > Model[[(0, 0, 0), (0, 1, 0, 12), 'ct']] - RMSE 3.947 | AIC 265.557 > Model[[(0, 0, 0), (0, 1, 1, 12), 'ct']] - RMSE 4.308 | AIC 1335.245 > Model[[(0, 0, 0), (0, 1, 2, 12), 'ct']] - RMSE 2.595 | AIC 98.695 > Model[[(0, 0, 0), (1, 0, 0, 12), 'ct']] - RMSE 4.448 | AIC 269.826 > Model[[(0, 0, 0), (1, 0, 1, 12), 'ct']] - RMSE 15.388 | AIC 259.926 > Model[[(0, 0, 0), (1, 0, 2, 12), 'ct']] - RMSE 3.012 | AIC 191.353 > Model[[(0, 0, 0), (1, 1, 0, 12), 'ct']] - RMSE 3.562 | AIC 183.127 > Model[[(0, 0, 0), (1, 1, 1, 12), 'ct']] - RMSE 7.375 | AIC 1397.941 > Model[[(0, 0, 0), (1, 1, 2, 12), 'ct']] - RMSE 3.206 | AIC 82.677 > Model[[(0, 0, 0), (2, 0, 0, 12), 'ct']] - RMSE 3.808 | AIC 183.196 > Model[[(0, 0, 0), (2, 0, 1, 12), 'ct']] - RMSE 8.956 | AIC 192.868 > Model[[(0, 0, 0), (2, 0, 2, 12), 'ct']] - RMSE 3.542 | AIC 162.864 > Model[[(0, 0, 0), (2, 1, 0, 12), 'ct']] - RMSE 3.219 | AIC 91.910 > Model[[(0, 0, 0), (2, 1, 1, 12), 'ct']] - RMSE 6.263 | AIC 955.165 > Model[[(0, 0, 0), (2, 1, 2, 12), 'ct']] - RMSE 5.736 | AIC 82.842 > Model[[(0, 0, 1), (0, 0, 0, 0), 'c']] - RMSE 5.986 | AIC 351.108 > Model[[(0, 0, 1), (0, 0, 0, 
12), 'c']] - RMSE 5.986 | AIC 351.108 > Model[[(0, 0, 1), (0, 0, 1, 12), 'c']] - RMSE 5.175 | AIC 257.858 > Model[[(0, 0, 1), (0, 0, 2, 12), 'c']] - RMSE 4.254 | AIC 189.267 > Model[[(0, 0, 1), (0, 1, 0, 12), 'c']] - RMSE 3.612 | AIC 234.637 > Model[[(0, 0, 1), (0, 1, 1, 12), 'c']] - RMSE 665.776 | AIC 1392.612 > Model[[(0, 0, 1), (0, 1, 2, 12), 'c']] - RMSE 2.418 | AIC 89.083 > Model[[(0, 0, 1), (1, 0, 0, 12), 'c']] - RMSE 3.467 | AIC 241.513 > Model[[(0, 0, 1), (1, 0, 1, 12), 'c']] - RMSE 9.379 | AIC 233.584 > Model[[(0, 0, 1), (1, 0, 2, 12), 'c']] - RMSE 2.499 | AIC 163.605 > Model[[(0, 0, 1), (1, 1, 0, 12), 'c']] - RMSE 2.975 | AIC 170.272 > Model[[(0, 0, 1), (1, 1, 1, 12), 'c']] - RMSE 119.311 | AIC 1453.525 > Model[[(0, 0, 1), (1, 1, 2, 12), 'c']] - RMSE 2.826 | AIC 88.914 > Model[[(0, 0, 1), (2, 0, 0, 12), 'c']] - RMSE 3.248 | AIC 175.116 > Model[[(0, 0, 1), (2, 0, 1, 12), 'c']] - RMSE 5.574 | AIC 168.748 > Model[[(0, 0, 1), (2, 0, 2, 12), 'c']] - RMSE 3.339 | AIC 179.262 > Model[[(0, 0, 1), (2, 1, 0, 12), 'c']] - RMSE 3.937 | AIC 105.003 > Model[[(0, 0, 1), (2, 1, 1, 12), 'c']] - RMSE 26.834 | AIC 966.564 > Model[[(0, 0, 1), (2, 1, 2, 12), 'c']] - RMSE 2.969 | AIC 93.972 > Model[[(0, 0, 1), (0, 0, 0, 0), 't']] - RMSE 9.811 | AIC 435.862 > Model[[(0, 0, 1), (0, 0, 0, 12), 't']] - RMSE 9.811 | AIC 435.862 > Model[[(0, 0, 1), (0, 0, 1, 12), 't']] - RMSE 9727.949 | AIC 298.592 > Model[[(0, 0, 1), (0, 0, 2, 12), 't']] - RMSE 4.789 | AIC 193.515 > Model[[(0, 0, 1), (0, 1, 0, 12), 't']] - RMSE 3.290 | AIC 228.939 > Model[[(0, 0, 1), (0, 1, 1, 12), 't']] - RMSE 5.779 | AIC 1312.584 > Model[[(0, 0, 1), (0, 1, 2, 12), 't']] - RMSE 2.457 | AIC 84.928 > Model[[(0, 0, 1), (1, 0, 0, 12), 't']] - RMSE 2.660 | AIC 229.653 > Model[[(0, 0, 1), (1, 0, 1, 12), 't']] - RMSE 1536.048 | AIC 211.783 > Model[[(0, 0, 1), (1, 0, 2, 12), 't']] - RMSE 2.192 | AIC 141.946 > Model[[(0, 0, 1), (1, 1, 0, 12), 't']] - RMSE 2.743 | AIC 168.379 > Model[[(0, 0, 1), (1, 1, 1, 12), 't']] - RMSE 
[Grid-search output condensed — the cell prints one line per candidate SARIMA configuration, in the form `Model[[(p, d, q), (P, D, Q, 12), trend]] - RMSE x | AIC y`, sweeping non-seasonal orders (p, d, q), seasonal orders (P, D, Q) with period 12, and trend terms 'c', 't', and 'ct'. RMSE is the hold-out forecast error; AIC is the in-sample fit criterion.]

Strongest candidates in this portion of the search:

Model[[(0, 1, 2), (1, 1, 2, 12), 'c']] - RMSE 2.067 | AIC 74.215 (lowest AIC shown)
Model[[(1, 0, 0), (1, 0, 2, 12), 't']] - RMSE 1.872 | AIC 140.300 (lowest RMSE shown)

A handful of configurations fit poorly or failed to converge, producing extreme errors (e.g. Model[[(0, 0, 2), (0, 0, 1, 12), 't']] - RMSE 1253.366), and can be discarded outright.
'ct']] - RMSE 3.063 | AIC 201.954 > Model[[(1, 1, 2), (1, 0, 2, 12), 'ct']] - RMSE 2.332 | AIC 132.441 > Model[[(1, 1, 2), (1, 1, 0, 12), 'ct']] - RMSE 3.177 | AIC 151.578 > Model[[(1, 1, 2), (1, 1, 1, 12), 'ct']] - RMSE 5.452 | AIC 1011.669 > Model[[(1, 1, 2), (1, 1, 2, 12), 'ct']] - RMSE 8.698 | AIC 80.160 > Model[[(1, 1, 2), (2, 0, 0, 12), 'ct']] - RMSE 3.006 | AIC 154.497 > Model[[(1, 1, 2), (2, 0, 1, 12), 'ct']] - RMSE 2.653 | AIC 145.278 > Model[[(1, 1, 2), (2, 0, 2, 12), 'ct']] - RMSE 2.498 | AIC 133.420 > Model[[(1, 1, 2), (2, 1, 0, 12), 'ct']] - RMSE 4.271 | AIC 87.340 > Model[[(1, 1, 2), (2, 1, 1, 12), 'ct']] - RMSE 3.834 | AIC 755.425 > Model[[(1, 1, 2), (2, 1, 2, 12), 'ct']] - RMSE 5.687 | AIC 82.117 done [(1, 0, 0), (1, 0, 2, 12), 't'] 1.8715427519171566 [(0, 1, 0), (1, 0, 2, 12), 'c'] 1.905784046375752 [(0, 1, 2), (1, 0, 2, 12), 'c'] 1.9142961327031127
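The grid-search loop that produced the log above is not shown in this section; a generic sketch of the pattern follows. This is a reconstruction, not the notebook's exact code: the `fit_fn` callback, the parameter ranges, and the holdout scoring are assumptions. In the notebook, `fit_fn` would wrap statsmodels' `SARIMAX(...).fit().forecast(horizon)`.

```python
import itertools
import numpy as np

def sarima_grid_search(train, test, fit_fn, orders, seasonal_orders, trends, top_n=3):
    """Score every (order, seasonal_order, trend) combination by holdout RMSE
    and return the top_n best configurations, lowest RMSE first.

    fit_fn(train, order, seasonal_order, trend, horizon) should fit a model
    and return `horizon` forecast values.
    """
    scored = []
    for order, sorder, trend in itertools.product(orders, seasonal_orders, trends):
        try:
            pred = np.asarray(fit_fn(train, order, sorder, trend, len(test)))
            rmse = float(np.sqrt(np.mean((pred - np.asarray(test)) ** 2)))
            scored.append((rmse, [order, sorder, trend]))
        except Exception:
            continue  # skip configurations that fail to converge
    scored.sort(key=lambda item: item[0])  # lowest RMSE first
    return scored[:top_n]
```

Returning the three lowest-RMSE configurations mirrors the three winners reported at the end of the log.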
Jose Medina Observations¶
- Our prior ARIMA attempt had a Test Data RMSE of 8.1462 and a Train Data RMSE of 11.0983.
- Our SARIMA grid search identified a set of parameters that produces an RMSE of 1.872, which is far better. It has the following terms:
- [(1, 0, 0), (1, 0, 2, 12), 't']
- Non-seasonal portion:
- p = 1, d = 0, q = 0, where p is the auto-regressive order; these p lags act as our features when forecasting the AR element of the non-seasonal portion of the time series.
- Seasonal portion:
- P = 1, D = 0, Q = 2, m = 12, where m indicates a 12-month repeating seasonal cycle
- 't' indicates a linear trend term is optimal
- Let's fit our SARIMA model with these new parameters, and do the following:
- Produce a 24 month forecast
# define model configuration
my_order = (1, 0, 0)
my_seasonal_order = (1, 0, 2, 12)
my_trend = 't'
# Let's split into test / train to make it easy to visualize how our model performs
df_test = df_2000.loc['2014-08-31':'2016-07-31']
df_train = df_2000.loc['2000-01-31':'2014-08-30']
# Note: the model is fit on the full series; the split above is used for plotting and scoring
model = SARIMAX(df_2000, order=my_order, seasonal_order=my_seasonal_order, trend=my_trend)
res = model.fit()
# In-sample one-step-ahead predictions
# predict = res.get_prediction()
# predict_ci = predict.conf_int()
yhat = res.predict()
# Lastly, let's address part of the client's need
# Make 24 months of predictions
fcst_24 = res.forecast(24)
fcst_24
2016-08-31    62.636586
2016-09-30    52.167718
2016-10-31    46.408796
2016-11-30    41.975846
2016-12-31    43.637706
2017-01-31    44.264516
2017-02-28    41.434013
2017-03-31    43.202059
2017-04-30    43.210992
2017-05-31    46.959896
2017-06-30    53.375416
2017-07-31    62.356850
2017-08-31    63.291981
2017-09-30    53.279122
2017-10-31    48.054064
2017-11-30    43.855257
2017-12-31    45.749901
2018-01-31    46.624261
2018-02-28    44.103066
2018-03-31    45.891586
2018-04-30    46.024815
2018-05-31    49.750565
2018-06-30    55.924976
2018-07-31    64.898492
Freq: M, Name: predicted_mean, dtype: float64
#Plotting the original vs predicted series and Forecast
# Filtering to year 2000 and beyond to improve readability.
plt.figure(figsize=(16,8))
plt.plot(yhat, label = 'Train - Prediction', color='b')
plt.plot(df_train, label = 'Train', color='y')
plt.plot(df_test, label = 'Test', color='g')
plt.plot(fcst_24, label = '24 Mo Fcst', color='r')
plt.title('Actual vs Predicted vs. Train', fontsize=20)
plt.xticks(fontsize=20)
plt.yticks(fontsize=20)
plt.legend(prop={'size': 20})
plt.show()
print(f'Max value for forecast is {fcst_24.max()}')
Max value for forecast is 64.89849208421569
- Step 4: We should evaluate the performance of the model, comparing it to previous methods
# only on the first 175 values because the last 24 are predictions associated with the "test" window
print('Train Data vs. yhat for first 175 periods RMSE: %.4f'% mean_squared_error(yhat[:175].values,df_train.values, squared=False))
print('Test Data vs. yhat last 24 periods RMSE: %.4f'% mean_squared_error(yhat.tail(24).values,df_test.values, squared=False))
Train Data vs. yhat for first 175 periods RMSE: 2.5935
Test Data vs. yhat last 24 periods RMSE: 1.6115
Jose Medina Observations¶
- We observe that our SARIMA model performs far better than our original ARIMA model.
- In this new model we observe:
- Train: 2.59 RMSE vs. 8.15 in the original ARIMA model
- Test: 1.61 RMSE vs. 11.10 in the original ARIMA model
- Our SARIMA model performs better in Test than in Train, which is encouraging.
- We also observe that our 24 month forecast exhibits a logical repeating pattern that visually matches the general pattern of recent history.
# What is the month-to-month growth in CO2 emissions if we take the last 24 months in our dataset as a proxy?
# Let's normalize so we can characterize the slope as a %
df_test_norm = df_test/df_test.mean()
x = np.arange(df_test_norm.values.size)
fit = np.polyfit(x, df_test_norm.values, deg=1)
print ("Slope : " + str(fit[0]))
Slope : 0.010136005199044259
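To put the fitted slope in annual terms, we can compound it over a year. This is a rough translation that assumes the ~1.0%-of-mean-level-per-month slope on the normalized series behaves like a constant monthly growth rate:

```python
# Slope of ~0.0101 per month on the mean-normalized series, i.e. roughly
# 1.0% of the average level per month. Compounding over 12 months:
monthly_growth = 0.010136
annual_growth = (1 + monthly_growth) ** 12 - 1
print(f'Implied annual growth: {annual_growth:.1%}')  # roughly 13% per year
```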
import ipywidgets as widgets
from ipywidgets import Layout, Button, Box, interact, interactive
def fn_calc_wind(starting_co2, wind_offset, wind_kwh):
    # Monthly CO2 to displace, in metric tons.
    # starting_co2 is in million metric tons; 1 metric ton = 1,000,000 grams,
    # so a million metric tons is 1M * 1M grams.
    CO2_to_displace_in_tons = starting_co2.value * 1000000
    # Displacement effect of one wind turbine per month (grams -> metric tons)
    wind_grams_displaced_per_month = wind_offset.value * wind_kwh.value
    wind_tons_displaced_per_month = wind_grams_displaced_per_month / 1000000
    wind_turbines = round(CO2_to_displace_in_tons / wind_tons_displaced_per_month, 0)
    wind_site = wind_turbines / 250
    print('Assumptions: you entered the following')
    print(f'CO2_to_displace - in tons monthly: {CO2_to_displace_in_tons}')
    print(f'With wind turbines capable of displacing {wind_tons_displaced_per_month} tons of CO2 every month')
    print('We further assume that a wind site has on average 250 wind turbines')
    print('------Result-----')
    print(f'This implies we would need {int(wind_turbines)} new installed wind turbines to displace the monthly CO2 level')
    print(f'This is equivalent to {int(wind_site)} new sites')
output = widgets.Output()
# What level of CO2 should we start at?
starting_co2 = widgets.IntSlider(
min=round(fcst_24.min(),0),
max=round(fcst_24.max()+15,0),
step=5,
description='Monthly CO2',
value=round(fcst_24.max(),0)
)
# Net Gram Offset
# The gram offset MAX is a factor of 1.5 to reflect improving output of Wind Technology
wind_offset_per_kwh = widgets.IntSlider(
min=250,
max=round(439*1.50,0),
step=5,
description='Wind CO2 Offset',
value=439
)
# Wind Turbine Kilowatt Hours per Month
wind_kwh_per_month = widgets.IntSlider(
min=402000,
max=402000 * 2,
step=25000,
description='Wind kwh',
value=402000
)
def on_change(v):
    x = v['new']
print('\nStarting level of CO2 emissions from Natural Gas')
starting_co2.observe(on_change, names='value')
display(starting_co2)
print('\nAssumed net grams of CO2 emissions offset by each kWh of Wind (start with 450 grams - 11 grams = 439 grams)')
print('Default: 450 grams for Natural Gas emissions - 11 grams for wind turbines = 439 grams')
wind_offset_per_kwh.observe(on_change, names='value')
display(wind_offset_per_kwh)
print('\nAssumed Monthly kwh output provided by each individual wind turbine')
wind_kwh_per_month.observe(on_change, names='value')
display(wind_kwh_per_month)
button = widgets.Button(description="Calculate")
display(button)
def on_button_clicked(b):
    fn_calc_wind(starting_co2, wind_offset_per_kwh, wind_kwh_per_month)
button.on_click(on_button_clicked)
Starting level of CO2 emissions from Natural Gas
Assumed net grams of CO2 emissions offset by each kWh of Wind (start with 450 grams - 11 grams = 439 grams) Default: 450 grams for Natural Gas emissions - 11 grams for wind turbines = 439 grams
Assumed Monthly kwh output provided by each individual wind turbine
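The widget arithmetic can also be checked as a plain function. This standalone sketch mirrors the slider defaults above; the 65 million-ton input is the rounded forecast peak, and `turbines_needed` is a name introduced here for illustration:

```python
def turbines_needed(co2_million_tons, offset_g_per_kwh=439, kwh_per_month=402_000,
                    turbines_per_site=250):
    # Monthly CO2 to displace, converted from million metric tons to metric tons
    co2_tons = co2_million_tons * 1_000_000
    # Tons displaced per turbine per month (grams -> metric tons)
    tons_per_turbine = offset_g_per_kwh * kwh_per_month / 1_000_000
    turbines = co2_tons / tons_per_turbine
    return round(turbines), round(turbines / turbines_per_site)

# ~65 million metric tons: the peak of the 24-month forecast, rounded
turbines, sites = turbines_needed(65)
print(turbines, sites)  # on the order of 370K turbines, ~1,500 sites
```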
- Under the base case, fully offsetting the CO2 from natural gas emissions would require nearly 373K wind turbines, far more than are installed today.
- Annually, as a nation, we are adding roughly 3,000 wind turbines, a pace far behind what we need.
- Improving wind turbine technology is enabling greater energy output through larger rotors and taller hub heights, improving the economics even further.
- Clearly the solution should be a mix of technologies, ranging from geothermal to hydro, wind, solar, hydrogen, and batteries.
- Absent from our analysis is coal. Though coal is being disintermediated by other energy sources, primarily due to economics, there is still significant coal use in this country. In future analysis we will therefore recommend including incentives to replace coal as a fuel source with renewables, to dramatically reduce emissions.
- In this final element of our forecasting work we developed a useful predictive model using the SARIMA modelling approach.
- In this new model we observe:
- Train: 2.59 RMSE vs. 8.15 in the original ARIMA model
- Test: 1.61 RMSE vs. 11.10 in the original ARIMA model
- Our SARIMA model performs better in Test than in Train, which is encouraging.
- We also observe that our 24 month forecast exhibits a logical repeating pattern that visually matches the general pattern of recent history.
- We used the model to develop a 24 month forecast, then built a "sensitivity" tool on top of it to evaluate the effect of changing wind turbine technology, and the amount of CO2 it offsets, against the peak monthly level of CO2 emissions from natural gas energy production over the next 24 months.
Recommended Policy Investigations¶
- We recommend using this model, and the what-if tool, to explore alternative scenarios. The what-if tool can be expanded easily to include other scenarios and sensitivities. It could also be expanded to include visuals, and timeseries forecasts of alternative viable scenarios.
- We recommend exploring further expansion in incentives to drive increased adoptions of renewable energy sources.
- Our sensitivity analysis exposed a significant shortfall, for example, in the number of wind turbines required to offset the expected peak monthly CO2 emissions over the next 24 month period.
- Incentives for renewable energy adoption can take various forms, from feed-in tariffs to tax credits to cap-and-trade schemes that make polluting entities bear the true cost of their emissions.
- Any incentives must include viable regulatory processes to accelerate investment in the national electric grid infrastructure to make renewable energy, often sited in remote locations, capable of transmitting to load centers efficiently.
In future iterations of this work I recommend the following:¶
- Explore including the economic cost of implementing these policies, so we can evaluate the optimal allocation of investments in incentives to drive the desired renewable energy adoption behavior.
- We could implement an optimization algorithm to find the least-cost set of policies that sets us on the lowest-emissions course given a temperature reduction objective.
- Explore the experience of various US states and European nations in implementing incentives, to attempt to predict the efficacy of such incentives in the United States, either as federal regulatory programs or state-driven programs.
- From a modelling perspective, though our model achieved good performance, we could explore machine learning methods such as N-BEATS and other deep learning frameworks to potentially improve the modelling even further.